home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
CU Amiga Super CD-ROM 17
/
CU Amiga Magazine's Super CD-ROM 17 (1997)(EMAP Images)(GB)[!][issue 1997-12].iso
/
CUCD
/
Programming
/
DiceSource
/
doc
/
dprof.doc
< prev
next >
Wrap
Text File
|
1994-02-13
|
8KB
|
246 lines
dprof/dprof dprof/dprof
DPROF.DOC
DPROF <proffile> [-call]
The DPROF program generates profiling output from the binary data file
generated by an executable which was compiled with profiling enabled.
In order to use DPROF you must compile your program with the -prof
option. There are three levels of profiling:
-prof1 profile only your code (same as -prof with no numerical level)
-prof2 profile your code and the standard C library
-prof3 profile your code, the C library, and the amiga library tags
To use -prof2 you must have DLIB:CSP.LIB (small data profiled c.lib)
or DLIB:CSRP.LIB (small data profiled c.lib for registered arguments).
To use -prof3 you must have DLIB:AMIGASP20.LIB (small data profiled
amiga.lib) or DLIB:AMIGASRP20.LIB (small data profiled amiga.lib for
registerd arguments).
Other combinations are possible depending on the options you use for
linking. These libraries will not necessarily exist on DCC1:, they are
most likely lharc'd in DCC2:DLIB or DCC3:DLIB depending on how I've
arranged the distribution. Registered users can make other library
combinations directly from the library source.
USAGE WARNINGS
The profiling code is accurate to 20 microseconds under 2.0, 1/60
second under 1.3. The profiling code itself will slow down a program
by quite a bit but, in general, the system makes every attempt to
filter out its profiling overhead in the statistics file (so the grand
total time will be less then the actual amount of time the program took
to run).
Note, however, that the results will be skewed somewhat anyway, not
only due to the overhead of the profiling code, but also due to task
switches and other system overhead. To get accurate results you should
only run the executable that is to generate a .dprof file on an
unloaded system (i.e. don't do anything else while the executable is
running). Many calls to very short, quick routines will suffer the
most and numbers should be taken more in a qualitative fashion than a
quantitative fashion.
Keep in mind that it is not necessary to profile everything,
particulary for large projects. You may want most of the system to run
at full speed while only profiling a small part of it at a time.
EXAMPLE
EXAMPLE: -------------------------
void fubar1(void);
void fubar2(void);
void loop(long);
main(ac, av)
char *av[];
{
short i;
for (i = 0; i < 100; ++i) {
fubar1();
fubar2();
}
loop(10);
fubar1();
fubar2();
}
void
fubar1()
{
short j;
for (j = 0; j < 10000; ++j);
fubar2();
}
void
fubar2()
{
short j;
for (j = 0; j < 100; ++j);
}
void
loop(n)
{
if (n)
loop(n - 1);
}
----------------------------------
compile and the run the program, then dump the profile. the DPROF
program automatically appends the '.dprof' onto the filename you
specify.
1> dcc test.c -o test -prof1
1> test
1> dprof test
@($)DPROF V2.06.01 Sep 30 1991 test.dprof
GrandTotal: 539.53 mS
**** BY ROUTINE ****
_main calls=1 total= 539.53 mS (100.00%) local= 10.37 mS ( 1.92%)
_fubar1 calls=101 total= 517.45 mS ( 95.90%) local= 507.75 mS ( 94.10%)
_fubar2 calls=202 total= 20.44 mS ( 3.79%) local= 20.44 mS ( 3.79%)
_loop calls=11 total= 0.96 mS ( 0.17%) local= 0.96 mS ( 0.17%)
\_______/ \_________________________/ \__________________________/
total amount of time amount of time spent in the routine
the number of spent in the routing not including profiled subroutine
calls made to including all subroutine calls it may make.
the routine calls.
The percentage is relative to
Percentage is relative to the grand total run time
the grand total
**** BY PARENT ****
total number of calls made to fubar2()
and total running time, same as from
the first table
____________________________
/ \
_fubar2 calls=202 total= 20.44 mS
From _fubar1 calls=101 total= 9.69 mS ( 47.43%)
From _main calls=101 total= 10.75 mS ( 52.56%)
\_____________________/ \________________________/
number of calls made to time spent in fubar2() for calls
fubar2() from fubar1() and made from fubar1() and main(), adds
main() respectively up to the total on the first line.
_loop calls=11 total= 0.96 mS
From _main calls=1 total= 0.96 mS (100.00%)
From _loop calls=1 total= 0.04 mS ( 4.65%)
From _loop calls=1 total= 0.13 mS ( 13.95%)
From _loop calls=1 total= 0.22 mS ( 23.25%)
From _loop calls=1 total= 0.31 mS ( 32.55%)
From _loop calls=1 total= 0.40 mS ( 41.86%)
From _loop calls=1 total= 0.49 mS ( 51.16%)
From _loop calls=1 total= 0.60 mS ( 62.79%)
From _loop calls=1 total= 0.69 mS ( 72.09%)
From _loop calls=1 total= 0.78 mS ( 81.39%)
From _loop calls=1 total= 0.87 mS ( 90.69%)
**** COMBINED CALL TREE ****
Top line contains the same information from table 1
________________________________________________________
/ \
_main calls=1 tot= 539.53 (100.00) loc= 10.37 ( 1.92)
_fubar1 calls=101 tot= 517.45 ( 95.90) loc= 507.75 ( 94.10)
_fubar2 calls=101 tot= 10.75 ( 1.99) loc= 10.75 ( 1.99)
_loop calls=1 tot= 0.96 ( 0.17) loc= 0.08 ( 0.01)
\___________________________________________________________/
main() calls fubar1() 101 times, fubar1() takes 517 mS total
time over these calls. main() calls fubar2() directly 101
times and fubar2() takes 10 mS over these calls. Note that
fubar2()'s time is not the same as in table 1 because only
those calls made from main() are counted here.
Percentages are relative to main()
fubar1() calls fubar2() 101 times. percentages are relative
to fubar1()'s total time. Note that if you add fubar2()'s
number of calls and total time to the fubar2() entry above
you will get the grand total for fubar2() shown in the first
table.
_____________________________________________________
/ \
_fubar1 calls=101 tot= 517.45 (100.00) loc= 507.75 ( 98.12)
_fubar2 calls=101 tot= 9.69 ( 1.87) loc= 9.69 ( 1.87)
_fubar2 calls=202 tot= 20.44 (100.00) loc= 20.44 (100.00)
_loop calls=11 tot= 0.96 (100.00) loc= 0.96 (100.00)
<SELF> calls=10
\_____________/
The profiled data includes the entire call tree but for
simplicity, recursive calls are simply shown as above.
You can also request DPROF to print out the entire call tree. This is
done by adding the -call option to dprof. Note, however, that this
option may result in a huge amount of data printed out. On the
otherhand, much of the data is quite useful especially when tracing
subroutine stacking and other things.
1> dprof test -call
.....
**** CALL TREE ****
_main calls=1 tot= 539.53 (100.00) loc= 10.37 ( 1.92) {
_fubar1 calls=101 tot= 517.45 ( 95.90) loc= 507.75 ( 94.10) {
_fubar2 calls=101 tot= 9.69 ( 1.79) loc= 9.69 ( 1.79)
}
_fubar2 calls=101 tot= 10.75 ( 1.99) loc= 10.75 ( 1.99)
_loop calls=1 tot= 0.96 ( 0.17) loc= 0.08 ( 0.01) {
_loop calls=1 tot= 0.87 ( 0.16) loc= 0.08 ( 0.01) {
_loop calls=1 tot= 0.78 ( 0.14) loc= 0.08 ( 0.01) {
_loop calls=1 tot= 0.69 ( 0.12) loc= 0.08 ( 0.01) {
_loop calls=1 tot= 0.60 ( 0.11) loc= 0.11 ( 0.02) {
_loop calls=1 tot= 0.49 ( 0.09) loc= 0.08 ( 0.01) {
_loop calls=1 tot= 0.40 ( 0.07) loc= 0.08 ( 0.01) {
_loop calls=1 tot= 0.31 ( 0.05) loc= 0.08 ( 0.01) {
_loop calls=1 tot= 0.22 ( 0.04) loc= 0.08 ( 0.01) {
_loop calls=1 tot= 0.13 ( 0.02) loc= 0.08 ( 0.01) {
_loop calls=1 tot= 0.04 ( 0.00) loc= 0.04 ( 0.00)
}
}
}
}
}
}
}
}
}
}
}